Search results for "Formal languages"

showing 10 items of 79 documents

Alignment-free sequence comparison using absent words

2018

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as the $q$-gram distance, are usually computed in time linear in the length of the sequences. In this paper, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not oc…

0301 basic medicine; FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Sequence alignment; Information System; 0102 computer and information sciences; Circular word; Absent words; 01 natural sciences; Upper and lower bounds; Sequence comparison; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Absent word; Circular words; Mathematics; Sequence; Settore INF/01 - Informatica; Process (computing); q-gram; Computer Science Applications; 1707 Computer Vision and Pattern Recognition; q-grams; Composition (combinatorics); 030104 developmental biology; Computational Theory and Mathematics; Forbidden words; 010201 computation theory & mathematics; Focus (optics); Forbidden word; Word (computer architecture); Information Systems; Integer (computer science)

researchProduct
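
The $q$-gram distance mentioned in the abstract above can be illustrated with a short sketch. It assumes the common definition (the sum, over all $q$-grams, of the absolute difference between their occurrence counts in the two sequences); the function names are hypothetical and not taken from the paper, and this simple hash-based version makes no attempt to match the linear-time bound the abstract refers to.

```python
from collections import Counter


def qgram_profile(s: str, q: int) -> Counter:
    """Count every length-q substring (q-gram) of s, with multiplicity."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def qgram_distance(x: str, y: str, q: int) -> int:
    """Sum of absolute differences between the q-gram counts of x and y.

    Assumed definition; q-grams missing from one sequence count as 0 there.
    """
    px, py = qgram_profile(x, q), qgram_profile(y, q)
    return sum(abs(px[g] - py[g]) for g in px.keys() | py.keys())
```

For example, `qgram_distance("AAAA", "AAAT", 2)` compares the profiles {AA: 3} and {AA: 2, AT: 1}.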

On Combinatorial Generation of Prefix Normal Words

2014

A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present an efficient algorithm for exhaustively listing the prefix normal words of a fixed length. The algorithm is based on the fact that the language of prefix normal words is a bubble language, a class of binary languages with the property that, for any word w in the language, replacing the first occurrence of 01 with 10 in w results in another word in the language. We prove that each prefix normal word is produced in O(n) amortized time, and conjecture, based on expe…

Amortized analysis; Conjecture; Prefix Normal Word; Binary number; combinatorial generation; formal languages; prefix normal words; binary strings; jumbled pattern matching; bubble languages; efficient algorithms; Context (language use); Data_CODINGANDINFORMATIONTHEORY; Substring; Prefix; Combinatorics; Pattern matching; Algorithms; Word (computer architecture); Mathematics
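
The definition of a prefix normal word in the abstract above translates directly into a brute-force check. The sketch below is only an illustration of the definition, with hypothetical function names; it tests every substring in quadratic time and enumerates all binary words, so it is not the efficient generation algorithm the paper describes.

```python
from itertools import product


def is_prefix_normal(w: str) -> bool:
    """True if no substring of the binary word w has more 1s than the
    prefix of w of the same length (naive O(n^2) check)."""
    n = len(w)
    ones = [0]  # ones[i] = number of 1s in w[:i]
    for c in w:
        ones.append(ones[-1] + (c == "1"))
    for length in range(1, n + 1):
        prefix_ones = ones[length]
        for start in range(n - length + 1):
            if ones[start + length] - ones[start] > prefix_ones:
                return False
    return True


def prefix_normal_words(n: int) -> list[str]:
    """All prefix normal words of length n, by exhaustive filtering."""
    words = ("".join(bits) for bits in product("01", repeat=n))
    return [w for w in words if is_prefix_normal(w)]
```

The bubble-language property can be spot-checked on small lengths: for each listed word `w` containing `01`, `w.replace("01", "10", 1)` should again pass `is_prefix_normal`.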

Isometric Words Based on Swap and Mismatch Distance

2023

An edit distance is a metric between words that quantifies how two words differ by counting the number of edit operations needed to transform one word into the other. A word f is said to be isometric with respect to an edit distance if, for any pair of f-free words u and v, there exists a transformation of minimal length from u to v via the related edit operations such that all the intermediate words are also f-free. The adjective 'isometric' comes from the fact that, if the Hamming distance is considered (i.e., only mismatches), then isometric words are connected with definitions of isometric subgraphs of hypercubes. We consider the case of edit distance with swap and mismatch. We compare it…

FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Swap and mismatch distance; Isometric words; Overlap with errors

Probabilistic verification of all languages

2018

We present three protocols for verifying all languages: (i) For any unary (binary) language, there is a log-space (linear-space) interactive proof system (IPS); (ii) for any language, there is a constant-space weak-IPS (the non-members may not be rejected with high probability); and, (iii) for any language, there is a constant-space IPS with two provers where the verifier reads the input once. Additionally, we show that uncountably many binary (unary) languages can be verified in constant space and in linear (quadratic) expected time.

FOS: Computer and information sciences; Computer Science - Computational Complexity; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Computational Complexity (cs.CC)

A Fast Algorithm Finding the Shortest Reset Words

2012

In this paper we present a new fast algorithm for finding minimal reset words for finite synchronizing automata. The problem is known to be computationally hard, and our algorithm is exponential. Yet, it is faster than the algorithms used so far and it works well in practice. The main idea is to use a bidirectional BFS and radix (Patricia) tries to store and compare the resulting subsets. We give both theoretical and practical arguments showing that the branching factor is reduced efficiently. As a practical test we perform an experimental study of the length of the shortest reset word for random automata with $n$ states and 2 input letters. We follow Skvorsov and Tipikin, who have performed such a s…

FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Computer Science - Formal Languages and Automata Theory; Computer Science::Formal Languages and Automata Theory
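
To make the notion of a shortest reset word concrete, here is a minimal baseline sketch: a plain forward breadth-first search over subsets of states. It is not the bidirectional, trie-based algorithm of the paper, only the textbook approach that such algorithms improve on, and it is exponential in the number of states. The function name and the dictionary encoding of the automaton are assumptions made for illustration.

```python
from collections import deque


def shortest_reset_word(n_states, transitions):
    """Return a shortest word mapping all states to a single state,
    or None if the automaton is not synchronizing.

    transitions: dict mapping each letter to a list `delta`, where
    delta[q] is the state reached from state q (hypothetical encoding).
    """
    start = frozenset(range(n_states))
    if len(start) <= 1:
        return ""
    parent = {start: None}  # subset -> (previous subset, letter used)
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for letter, delta in transitions.items():
            image = frozenset(delta[q] for q in subset)
            if image in parent:
                continue
            parent[image] = (subset, letter)
            if len(image) == 1:
                # Walk back to the full state set to rebuild the word.
                word, node = [], image
                while parent[node] is not None:
                    node, used = parent[node]
                    word.append(used)
                return "".join(reversed(word))
            queue.append(image)
    return None
```

On the 4-state Černý automaton, `shortest_reset_word(4, {"a": [1, 2, 3, 0], "b": [0, 1, 2, 0]})` returns a word of length 9, matching the known value $(n-1)^2$ for that family.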

Quantum Pushdown Automata

2001

Quantum finite automata, as well as quantum pushdown automata (QPA), were first introduced by C. Moore and J. P. Crutchfield. In this paper we introduce the notion of QPA in a non-equivalent way, including unitarity criteria, by using the definition of quantum finite automata of Kondacs and Watrous. It is established that the unitarity criteria of QPA are not equivalent to the corresponding unitarity criteria of quantum Turing machines. We show that QPA can recognize every regular language. Finally, we present some simple languages recognized by QPA that are not recognizable by deterministic pushdown automata.

FOS: Computer and information sciences; Quantum Physics; Computer Science - Computational Complexity; Formal Languages and Automata Theory (cs.FL); FOS: Physical sciences; Computer Science - Formal Languages and Automata Theory; Computational Complexity (cs.CC); Quantum Physics (quant-ph); Computer Science::Formal Languages and Automata Theory

Languages with mismatches

2007

In this paper we study some combinatorial properties of a class of languages that represent sets of words occurring in a text S up to some errors. More precisely, we consider sets of words that occur in a text S with k mismatches in any window of size r. The study of this class of languages focuses mainly on a parameter called the repetition index, and on the set of the minimal forbidden words of the language of factors of S with errors. The repetition index of a string S is defined as the smallest integer such that all strings of this length occur in at most one position of the text S up to errors. We prove that there is a strong relation between the repetition index of S an…

Combinatorics on words; Approximate string matching; General Computer Science; Repetition (rhetorical device); String (computer science); Search engine indexing; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Data structure; Theoretical Computer Science; Combinatorics; Set (abstract data type); Formal languages; Indexing; Word (group theory); Mathematics; Integer (computer science); Computer Science(all)

A Classification of Trapezoidal Words

2011

Trapezoidal words are finite words having at most n+1 distinct factors of length n, for every n ≥ 0. They encompass finite Sturmian words. We partition trapezoidal words into two disjoint classes: open and closed trapezoidal words. A trapezoidal word is closed if its longest repeated prefix has exactly two occurrences in the word, the second one being a suffix of the word. Otherwise it is open. We show that open trapezoidal words are all primitive and that closed trapezoidal words are all Sturmian. We then show that trapezoidal palindromes are closed (and therefore Sturmian). This allows us to characterize the special factors of Sturmian palindromes. We end with several open problems.

FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); lcsh:Mathematics; trapezoidal words; Sturmian words; special factors; palindromes; Palindrome; Computer Science - Formal Languages and Automata Theory; Disjoint sets; lcsh:QA1-939; lcsh:QA75.5-76.95; Prefix; Combinatorics; F.4.3; FOS: Mathematics; Mathematics - Combinatorics; Combinatorics (math.CO); lcsh:Electronic computers. Computer science; Suffix; Word (group theory); Mathematics
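
The defining condition for trapezoidal words in the abstract above (at most n+1 distinct factors of length n, for every n ≥ 0) can be checked by direct counting. The following is a naive sketch with hypothetical function names; it enumerates all factors explicitly and is meant only to illustrate the definition.

```python
def factor_counts(w: str) -> list[int]:
    """counts[k] = number of distinct factors (substrings) of w of length k,
    for k = 0 .. len(w). The empty word is the only factor of length 0."""
    n = len(w)
    return [len({w[i:i + k] for i in range(n - k + 1)}) for k in range(n + 1)]


def is_trapezoidal(w: str) -> bool:
    """True if w has at most k + 1 distinct factors of each length k."""
    return all(count <= k + 1 for k, count in enumerate(factor_counts(w)))
```

For instance, `aabb` has factor counts 1, 2, 3, 2, 1 for lengths 0 to 4 and passes the check, whereas `aabba` has four distinct factors of length 2 (aa, ab, bb, ba) and fails it.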

The Intersection of $3$-Maximal Submonoids

2020

Very little is known about the structure of the intersection of two $k$-generated monoids of words, even for $k=3$. Here we investigate the case of $k$-maximal monoids, that is, monoids whose basis of cardinality $k$ cannot be non-trivially decomposed into at most $k$ words. We characterize the intersection in the case of two $3$-maximal monoids.

Free graph; Settore INF/01 - Informatica; General Computer Science; Mathematics::Category Theory; 3-maximal monoids; Mathematics - Combinatorics; Computer Science - Formal Languages and Automata Theory; 68R15; Intersection; Theoretical Computer Science

Finite automata with advice tapes

2013

We define a model of advised computation by finite automata where the advice is provided on a separate tape. We consider several variants of the model where the advice is deterministic or randomized, the input tape head is allowed real-time, one-way, or two-way access, and the automaton is classical or quantum. We prove several separation results among these variants, demonstrate an infinite hierarchy of language classes recognized by automata with increasing advice lengths, and establish the relationships between this and the previously studied ways of providing advice to finite automata.

FOS: Computer and information sciences; TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Nonlinear Sciences::Cellular Automata and Lattice Gases; Computer Science::Formal Languages and Automata Theory
